@lovesegfault
Member

Motivation

Nix failed to download files served with Content-Encoding: x-gzip
because libarchive doesn't recognize the legacy x-* compression
format names. Per RFC 9110 §8.4.1.3, HTTP recipients should treat
these as equivalent to their standard counterparts.

Adds normalizeCompressionMethod() to map legacy encoding names
before passing to libarchive:

  • x-gzip → gzip
  • x-compress → compress
  • x-bzip2 → bzip2
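
The mapping above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: only the function name comes from the description, while the signature and the pass-through behaviour for unrecognised names are assumptions.

```cpp
#include <string>
#include <string_view>

// Hypothetical sketch; the signature is assumed, not taken from the PR.
// Maps legacy "x-" encoding names to their standard equivalents and
// returns every other name unchanged.
std::string normalizeCompressionMethod(std::string_view method)
{
    if (method == "x-gzip") return "gzip";
    if (method == "x-compress") return "compress";
    if (method == "x-bzip2") return "bzip2";
    return std::string(method);
}
```

Under these assumptions, `normalizeCompressionMethod("x-gzip")` yields `"gzip"`, and a name such as `"xz"` passes through untouched before being handed to libarchive.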

Context

Fixes: #14324


Contributor

@xokdvium xokdvium left a comment

I think that what we should do is stop using strings to represent enumeration types. We confuse the compression algorithm names used by libarchive with our non-standard Content-Encoding headers. Those need to become clearly separated.

@Mic92
Member

Mic92 commented Oct 29, 2025

I think that what we should do is stop using strings to represent enumeration types. We confuse the compression algorithm names used by libarchive with our non-standard Content-Encoding headers. Those need to become clearly separated.

you mean libarchive has enum types for compression?

@xokdvium
Contributor

libarchive has enum types for compression?

It doesn't, unfortunately, but we really should have our own to wrap around libarchive.

@Ericson2314
Member

I agree that making our own enum sounds like the right call.

Contributor

@tomberek tomberek left a comment

Can refactor into enum in another PR.

Contributor

@xokdvium xokdvium left a comment

Note that this must only affect Content-Encoding parsing, e.g. it must not be possible to specify the deprecated name as the store parameter. Since there's currently no distinction in the code, it's a no-go IMO.
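
One way the separation discussed in this thread might look is sketched below. Everything here is hypothetical: the enum `CompressionAlgo`, the functions `parseCompressionAlgo`, `parseContentEncoding`, and `toArchiveName`, and the set of algorithms are illustrative assumptions, not code from Nix or from the WIP commits mentioned later. The idea is that only the Content-Encoding parser accepts the legacy aliases, while user-facing settings go through a strict parser.

```cpp
#include <optional>
#include <string_view>

// Hypothetical enum; the names and the set of algorithms are illustrative.
enum class CompressionAlgo { Gzip, Bzip2, Compress };

// Strict parser for user-facing settings such as a store parameter:
// only canonical names are accepted.
std::optional<CompressionAlgo> parseCompressionAlgo(std::string_view name)
{
    if (name == "gzip") return CompressionAlgo::Gzip;
    if (name == "bzip2") return CompressionAlgo::Bzip2;
    if (name == "compress") return CompressionAlgo::Compress;
    return std::nullopt;
}

// Lenient parser used only for the HTTP Content-Encoding header:
// the legacy "x-" aliases are accepted here in addition to canonical names.
std::optional<CompressionAlgo> parseContentEncoding(std::string_view value)
{
    if (value == "x-gzip") return CompressionAlgo::Gzip;
    if (value == "x-bzip2") return CompressionAlgo::Bzip2;
    if (value == "x-compress") return CompressionAlgo::Compress;
    return parseCompressionAlgo(value);
}

// Canonical name to hand to libarchive once the algorithm is known.
constexpr std::string_view toArchiveName(CompressionAlgo algo)
{
    switch (algo) {
    case CompressionAlgo::Gzip: return "gzip";
    case CompressionAlgo::Bzip2: return "bzip2";
    case CompressionAlgo::Compress: return "compress";
    }
    return ""; // not reached for valid enum values
}
```

With this split, `parseCompressionAlgo("x-gzip")` returns an empty optional, so the deprecated name cannot be used as a store parameter, while `parseContentEncoding("x-gzip")` still resolves to the gzip variant.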

@lovesegfault
Member Author

fwiw I agree the enum approach is better here, just haven't found the time to do it

@xokdvium
Contributor

just haven't found the time to do it

I have some WIP commits for that. In the meantime I don't see a need to rush. This (not accepting deprecated non-standard aliases) is not a regression.

Development

Successfully merging this pull request may close these issues.

Content-Encoding: x-gzip should be supported

5 participants